Readability of Japanese Electronic Text with Phrase-based Line Breaking
Similar resources
Text Readability and Word Distribution in Japanese
This paper reports the relation between text readability and word distribution in the Japanese language. There was no similar study in the past due to three major obstacles: (1) unclear definition of Japanese “word”, (2) no balanced corpus, and (3) no readability measure. Compilation of the Balanced Corpus of Contemporary Written Japanese (BCCWJ) and development of a readability predictor remov...
Automatic Assessment of Japanese Text Readability Based on a Textbook Corpus
Department of Electrical Engineering and Computer Science, Graduate School of Engineering, Nagoya University, Chikusa-ku, Nagoya, 464-8603, Japan. {matuyosi,kondoh}@sslab.nuee.nagoya-u.ac.jp. Abstract: This paper describes a method of readability measurement of Japanese texts based on a newly compiled textbook corpus. The textbook corpus consists of 1,478 sample passages ex...
Assessing Text Readability Using Cognitively Based Indices
Many programs designed to compute the readability of texts are narrowly based on surface-level linguistic features and take too little account of the processes which a reader brings to the text. This study is an exploratory examination of the use of Coh-Metrix, a computational tool that measures cohesion and text difficulty at various levels of language, discourse, and conceptual analysis. It i...
Text Document Clustering based on Phrase
Affinity propagation (AP) was recently introduced as an unsupervised learning algorithm for exemplar-based clustering. In this paper, a novel text document clustering algorithm has been developed based on the vector space model, phrases, and the affinity propagation clustering algorithm. The proposed algorithm is called Phrase affinity clustering (PAC). PAC first finds the phrase by Ukkonen suffix tree con...
Phrase-Based Pattern Matching in Compressed Text
Byte codes are a practical alternative to the traditional bit-oriented compression approaches when large alphabets are being used, and trade away a small amount of compression effectiveness for a relatively large gain in decoding efficiency. Byte codes also have the advantage of being searchable using standard string matching techniques. Here we describe methods for searching in byte-coded comp...
Journal
Journal title: Transactions of the Japanese Society for Artificial Intelligence
Year: 2015
ISSN: 1346-0714,1346-8030
DOI: 10.1527/tjsai.30.479